Proto-Indo-European Lexicon: The Generative Etymological Dictionary of Indo-European Languages
Abstract
Proto-Indo-European Lexicon (PIE Lexicon) is the generative etymological dictionary of the Indo-European languages. Its reconstruction of Proto-Indo-European (PIE) is obtained by applying the comparative method, whose output equals the Indo-European (IE) data. Because of this, the Indo-European sound laws leading from PIE to the IE languages, as revised in Pyysalo 2013, can be coded as finite-state transducers (FSTs). For this purpose, PIE Lexicon uses the foma finite-state compiler by Mans Hulden (2009). At present, PIE Lexicon generates data for some 120 Indo-European languages with an accuracy rate of over 99%, making it the first dictionary in the world capable of generating (predicting) its data entries by means of digitized sound laws.
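The idea of generating entries from digitized sound laws and then scoring them against attested forms can be illustrated with a minimal Python sketch. It is not the PIE Lexicon implementation: the project compiles its laws with foma, whereas this sketch swaps in ordered regular-expression rewrites, and the rules, the toy forms, and the names `TOY_RULES`, `apply_rules`, and `accuracy` are all hypothetical and do not come from Pyysalo 2013.

```python
import re

# Hypothetical toy rules for illustration only. These are NOT the
# sound laws of Pyysalo 2013, and the forms below are not real data.
# Each rule is (regex pattern, replacement); order matters.
TOY_RULES = [
    (r"p", "f"),              # toy rule 1: p > f everywhere
    (r"t(?=[aeiou])", "þ"),   # toy rule 2: t > þ before a vowel
    (r"e$", ""),              # toy rule 3: word-final e is lost
]


def apply_rules(proto, rules):
    """Apply the rewrite rules in order, as a cascade, to one proto-form."""
    form = proto
    for pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form


def accuracy(pairs, rules):
    """Share of (proto-form, attested form) pairs that the rules generate exactly."""
    hits = sum(1 for proto, attested in pairs
               if apply_rules(proto, rules) == attested)
    return hits / len(pairs)


# Toy "dictionary": the last pair is deliberately not generated by the rules.
TOY_PAIRS = [
    ("pate", "faþ"),
    ("tap", "þaf"),
    ("pet", "fet"),
    ("ate", "aþe"),
]

if __name__ == "__main__":
    for proto, attested in TOY_PAIRS:
        print(proto, "->", apply_rules(proto, TOY_RULES), "(attested:", attested + ")")
    print("accuracy:", accuracy(TOY_PAIRS, TOY_RULES))
```

The cascade mirrors the composition of rule transducers in a finite-state setting: each rule consumes the output of the previous one, so changing the rule order can change the generated form. An accuracy figure like the one reported above would correspond to the share of entries whose generated form matches the attested form.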
Related resources
Problems in Computerized Historical Linguistics: the Old Cornish Lexicon
This work represents an attempt to utilize the computer in solving problems in historical linguistics. The corpus upon which it operates is not a language but a recently published etymological dictionary of Old Cornish. Any observations regarding the scarcity or inaccuracy of the data utilized are, therefore, irrelevant, as far as the present paper is concerned. As the dictionary in question ...
2. IE & IEs
PROTO-INDO-EUROPEAN is the traditional name given to the ancestor language of the Indo-European family that is spread from Iceland to Chinese Turkestan and from Scandinavia to the Near East. A PROTO-LANGUAGE (Gk. prõtos ‘first’) refers to the earliest form of a language family presupposed by all of its descendants. There will forever be major gaps in the reconstruction of proto-languages, but a...
Dictionary Organization in Linguistic Automaton for Oriental Languages
The central problem for natural language processing (NLP) systems dealing with non-Indo-European (“Oriental”) languages is how to develop automatic dictionaries (AD) and dictionary entry (DE) schemes. The point is that the need of Oriental language industrial NLP has been felt for some time. It has acquired additional urgency with the rapid growth of business contacts between Russia and the nat...
Learning an English-Chinese Lexicon from a Parallel Corpus
We report experiments on automatic learning of an English-Chinese translation lexicon, through statistical training on a large parallel corpus. The learned vocabulary size is nontrivial at 6,517 English words averaging 2.33 Chinese translations per entry, with a manually filtered precision of 95.1% and a single-most-probable precision of 91.2%. We then introduce a significance filtering method t...
Phonaesthemic and Etymological Effects on the Distribution of Senses in Statistical Models of Semantics
This paper uses methods based on corpus statistics and synonymy to explore the role language history and sound/form relationships play in conceptual organization through a case study relating the phonaestheme gl- to its prevalent Proto-Indo-European root, *ghel. The results of both methods point to a strong link between the phonaestheme and the historical root, suggesting that the lineage of a la...